-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Datasets more robust to non-string keys #2174
Conversation
@@ -8,6 +8,7 @@ testpaths=xarray/tests | |||
[flake8] | |||
max-line-length=79 | |||
ignore= | |||
W503 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is line break before operator, which has been changed in PEP8
xarray/core/dataarray.py
Outdated
try: | ||
is_coord_key = any([ | ||
isinstance(key, basestring), | ||
key in set(self.dims).union(self.coords) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would hesitate to do this fix -- it means array[2]
could pull out a coordinate if 2
is the name of a dimension. More generally, DataArray.__getitem__
is only meant as a convenient short-cut for getting coordinates, which we might even want to deprecate entirely. DataArray.coords.__getitem__
is the reliable way to get a coordinate out.
Instead, I think the better option would be to avoid making use of **
unpacking below, in the expression self.isel(**self._item_key_to_dict(key))
. This would be a little more involved, but I think the right fix would be to make most of these functions accept a positional dictionaries instead of **kwargs
, as suggested in #1231.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, completely agree, thanks
I added an initial attempt at allowing for supplying a dict; on Let me know ppl's thoughts:
|
👍 |
@shoyer I see you're online - let me know if you have any thoughts re the changes. Am on a burst of open-source work atm |
I'd like a convention of matching names to make it clear in docstrings that these are the same thing, e.g.,
Probably the last is best since users will never have cause to type the longer argument name ending with In practice, I suspect these first arguments are almost always going to be used positionally. We might even imagine making them positional only argument if PEP 570 is ever accepted. So I don't think it matters too much.
Let's raise (like your current implementation) if both are provided. Combining could be error prone and seems unnecessarily complicated. This is the same logic I implemented in the somewhat misleadingly named
I agree -- I think the current version is manageable. For internal use, we should always use the positional argument, but there's not much harm in leaving |
Any others on the 'first priority' list? |
AppVeyor failure unrelated |
xarray/core/dataset.py
Outdated
@@ -1404,12 +1406,16 @@ def isel(self, drop=False, **indexers): | |||
Dataset.sel | |||
DataArray.isel | |||
""" | |||
|
|||
indexers = combine_pos_and_kw_args(indexers, indexers_kwargs, 'isel') | |||
assert isinstance(drop, bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should either be dropped for now, or raise TypeError
. assert
isn't appropriate for external APIs.
xarray/core/dataarray.py
Outdated
"""Conform this object onto a new set of indexes, filling in | ||
missing values with NaN. | ||
|
||
Parameters | ||
---------- | ||
**indexers : dict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't start with **
xarray/core/dataarray.py
Outdated
@@ -18,7 +18,9 @@ | |||
from .formatting import format_item | |||
from .options import OPTIONS | |||
from .pycompat import OrderedDict, basestring, iteritems, range, zip | |||
from .utils import decode_numpy_dict_values, ensure_us_time_resolution | |||
from .utils import ( | |||
combine_pos_and_kw_args, decode_numpy_dict_values, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we're at it, maybe we should rename combine_pos_and_kw_args
to something clearer, maybe either_dict_or_kwargs
?
xarray/core/dataarray.py
Outdated
values will be filled in with NaN, and any mis-matched dimension | ||
names will simply be ignored. | ||
**indexers_kwargs : {dim: indexer, ...} | ||
The keyword arguments form of ``indexers`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add something like: Only indexers
or indexer_kwargs
may be provided.
Looking through Dataset/DataArray methods that support
|
But I think just supporting this for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great to me!
xarray/core/dataset.py
Outdated
@@ -1444,6 +1452,15 @@ def sel(self, method=None, tolerance=None, drop=False, **indexers): | |||
|
|||
Parameters | |||
---------- | |||
---------- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
delete
@@ -183,7 +183,7 @@ def is_full_slice(value): | |||
return isinstance(value, slice) and value == slice(None) | |||
|
|||
|
|||
def combine_pos_and_kw_args(pos_kwargs, kw_kwargs, func_name): | |||
def either_dict_or_kwargs(pos_kwargs, kw_kwargs, func_name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to test every use, but we should at least add unit test for this helper function -- I don't think we have any test coverage currently.
thanks @maxim-lian ! |
thanks for working through the feedback, as ever, @shoyer ! |
whats-new.rst
for all changes andapi.rst
for new APII don't think this is the most efficient way of doing this, though it does work. Any ideas for a more efficient implementation?